Search results for "Massively parallel"

showing 10 items of 23 documents

An autonomous petrological database for geodynamic simulations of magmatic systems

2022

SUMMARY Self-consistent modelling of magmatic systems is challenging as the melt continuously changes its chemical composition upon crystallization, which may affect the mechanical behaviour of the system. Melt extraction and subsequent crystallization create new rocks while depleting the source region. As the chemistry of the source rocks changes locally due to melt extraction, new calculations of the stable phase assemblages are required to track the rock evolution and the accompanying change in density. As a consequence, a large number of isochemical sections of stable phase assemblages are required to study the evolution of magmatic systems in detail. As the state-of-the-art melting diag…

010504 meteorology & atmospheric sciences; Database; Function (mathematics); Parameter space; 010502 geochemistry & geophysics; computer.software_genre; 01 natural sciences; Geophysics; Geochemistry and Petrology; 13. Climate action; Phase (matter); Principal component analysis; Probability distribution; Computational problem; Cluster analysis; computer; Massively parallel; 0105 earth and related environmental sciences

Geophysical Journal International

researchProduct

Massively Parallel ANS Decoding on GPUs

2019

In recent years, graphics processors have enabled significant advances in the fields of big data and streamed deep learning. In order to keep control of rapidly growing amounts of data and to achieve sufficient throughput rates, compression features are a key part of many applications including popular deep learning pipelines. However, as most of the respective APIs rely on CPU-based preprocessing for decoding, data decompression frequently becomes a bottleneck in accelerated compute systems. This establishes the need for efficient GPU-based solutions for decompression. Asymmetric numeral systems (ANS) represent a modern approach to entropy coding, combining superior compression results wit…

020203 distributed computing; Computer science; 020206 networking & telecommunications; Data_CODINGANDINFORMATIONTHEORY; 02 engineering and technology; Parallel computing; CUDA; Scalability; 0202 electrical engineering electronic engineering information engineering; Codec; SIMD; Entropy encoding; Massively parallel; Decoding methods; Data compression

Proceedings of the 48th International Conference on Parallel Processing

researchProduct

WarpDrive: Massively Parallel Hashing on Multi-GPU Nodes

2018

Hash maps are among the most versatile data structures in computer science because of their compact data layout and expected constant time complexity for insertion and querying. However, associated memory access patterns during the probing phase are highly irregular, resulting in strongly memory-bound implementations. Massively parallel accelerators such as CUDA-enabled GPUs may overcome this limitation by virtue of their fast video memory featuring almost one TB/s bandwidth in comparison to main memory modules of state-of-the-art CPUs with less than 100 GB/s. Unfortunately, the size of hash maps supported by existing single-GPU hashing implementations is restricted by the limited amount of …

020203 distributed computing; Computer science; Hash function; 0102 computer and information sciences; 02 engineering and technology; Parallel computing; Data structure; 01 natural sciences; Hash table; Electronic mail; Memory management; 010201 computation theory & mathematics; Scalability; 0202 electrical engineering electronic engineering information engineering; Massively parallel; Time complexity

2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

researchProduct

Massively Parallel Huffman Decoding on GPUs

2018

Data compression is a fundamental building block in a wide range of applications. Besides its intended purpose to save valuable storage on hard disks, compression can be utilized to increase the effective bandwidth to attached storage as realized by state-of-the-art file systems. In the foreseeable future, on-the-fly compression and decompression will gain utmost importance for the processing of data-intensive applications such as streamed Deep Learning tasks or Next Generation Sequencing pipelines, which establishes the need for fast parallel implementations. Huffman coding is an integral part of a number of compression methods. However, efficient parallel implementation of Huffman decompre…

020203 distributed computing; Computer science; business.industry; Deep learning; 020206 networking & telecommunications; Data_CODINGANDINFORMATIONTHEORY; 02 engineering and technology; Parallel computing; Huffman coding; symbols.namesake; CUDA; Titan (supercomputer); 0202 electrical engineering electronic engineering information engineering; symbols; Artificial intelligence; business; Massively parallel; Data compression

Proceedings of the 47th International Conference on Parallel Processing

researchProduct

mD3DOCKxb: An Ultra-Scalable CPU-MIC Coordinated Virtual Screening Framework

2017

Molecular docking is an important method in computational drug discovery. In large-scale virtual screening, millions of small drug-like molecules (chemical compounds) are compared against a designated target protein (receptor). Depending on the utilized docking algorithm for screening, this can take several weeks on conventional HPC systems. However, for certain applications, including large-scale screening tasks for newly emerging infectious diseases, such high runtimes can be prohibitive. In this paper, we investigate how the massively parallel neo-heterogeneous architecture of the Tianhe-2 supercomputer, consisting of thousands of nodes comprising CPUs and MIC coprocessors that can effic…

0301 basic medicine; Virtual screening; Multi-core processor; Coprocessor; Computer science; business.industry; Parallel computing; Supercomputer; 03 medical and health sciences; 030104 developmental biology; Embedded system; Scalability; Tianhe-2; Algorithm design; business; Massively parallel

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)

researchProduct

SWhybrid: A Hybrid-Parallel Framework for Large-Scale Protein Sequence Database Search

2017

Computer architectures continue to develop rapidly towards massively parallel and heterogeneous systems. Thus, easily extensible yet highly efficient parallelization approaches for a variety of platforms are urgently needed. In this paper, we present SWhybrid, a hybrid computing framework for large-scale biological sequence database search on heterogeneous computing environments with multi-core or many-core processing units (PUs) based on the Smith-Waterman (SW) algorithm. To incorporate a diverse set of PUs such as combinations of CPUs, GPUs and Xeon Phis, we abstract them as SIMD vector execution units with different numbers of lanes. We propose a machine model, associated with a unified …

0301 basic medicine; Xeon; Sequence database; business.industry; Computer science; Interface (computing); Symmetric multiprocessor system; Parallel computing; Set (abstract data type); 03 medical and health sciences; 030104 developmental biology; Software; Computer architecture; SIMD; business; Massively parallel

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

researchProduct

Optimizing Query Perturbations to Enhance Shape Retrieval

2020

3D shape retrieval algorithms use shape descriptors to identify shapes in a database that are the most similar to a given key shape, called the query. Many shape descriptors are known but none is perfect. Therefore, the common approach in building 3D shape retrieval tools is to combine several descriptors with some fusion rule. This article proposes an orthogonal approach. The query is improved with a Genetic Algorithm. The latter evolves a population of perturbed copies of the query, called clones. The best clone is the closest to its closest shapes in the database, for a given shape descriptor. Experimental results show that improving the query also improves the precision and complet…

050101 languages & linguistics; Computer science; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Population; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 02 engineering and technology; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; Search engine; Completeness (order theory); Genetic algorithm; 0202 electrical engineering electronic engineering information engineering; 0501 psychology and cognitive sciences; [INFO]Computer Science [cs]; education; Massively parallel; ComputingMilieux_MISCELLANEOUS; Thesaurus (information retrieval); education.field_of_study; Cloning (programming); business.industry; 05 social sciences; Pattern recognition; Key (cryptography); 020201 artificial intelligence & image processing; Artificial intelligence; business

researchProduct

Why Cortices? Neural Computation in the Vertebrate Visual System

1989

We propose three high-level structural principles of neural networks in the vertebrate visual cortex and discuss some of their computational implications for early vision: a) Lamination, average axonal and dendritic domains, and intrinsic feedback determine the spatio-temporal interactions in cortical processing. Possible applications of the resulting filters include continuous motion perception and the direct measurement of high-level parameters of image flow; b) Retinotopic mapping is an emergent property of massively parallel connections. With a local intrinsic operation in the target area, mapping combines to a space-variant image processing system as would be useful in the analysis of …

Artificial neural network; Computer science; business.industry; Property (programming); Optical flow; Pattern recognition; Image processing; Visual cortex; medicine.anatomical_structure; Models of neural computation; medicine; Motion perception; Artificial intelligence; business; Massively parallel

researchProduct

CRiSPy-CUDA: Computing Species Richness in 16S rRNA Pyrosequencing Datasets with CUDA

2011

Pyrosequencing technologies are frequently used for sequencing the 16S rRNA marker gene for metagenomic studies of microbial communities. Computing a pairwise genetic distance matrix from the produced reads is an important but highly time-consuming task. In this paper, we present a parallelized tool (called CRiSPy) for scalable pairwise genetic distance matrix computation and clustering that is based on the processing pipeline of the popular ESPRIT software package. To achieve high computational efficiency, we have designed massively parallel CUDA algorithms for pairwise k-mer distance and pairwise genetic distance computation. We have also implemented a memory-efficient sparse matrix clust…

CUDA; Distance matrix; Computer science; Metagenomics; Pipeline (computing); Pairwise comparison; Parallel computing; Cluster analysis; Quantitative Biology::Genomics; Massively parallel; Sparse matrix

researchProduct

Computational Methods for Gene Expression Profiling Using Next-Generation Sequencing (RNA-Seq)

2014

Cancer genome sequencing; Massive parallel sequencing; Single cell sequencing; Computational biology; Biology; Bioinformatics; Deep sequencing; Exome sequencing; DNA sequencing; Illumina dye sequencing; Massively parallel signature sequencing

researchProduct